Word classification based on phonetic features

ABSTRACT

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining a textual term; determining, by one or more computers, a vector representing a phonetic feature of the textual term; comparing the vector representing the phonetic feature of the textual term with a reference vector representing a phonetic feature of a reference textual term; and classifying the textual term based on the comparing the vector with the reference vector.

CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of U.S. Provisional Application Ser. No. 62/042,671, filed on Aug. 27, 2014, which is incorporated by reference.

BACKGROUND

This specification relates to using features of a textual term to classify the textual term.

SUMMARY

According to one innovative aspect of the subject matter described in this specification, phonetic features of textual terms, such as pronunciation of the textual terms, may be generated. A model for processing textual terms may be trained using the phonetic features. The model may be used to generate a representation of the phonetic features. The representation of the phonetic features of a textual term may be used to classify the textual term. The representation of the phonetic features of a textual term may be compared with a representation of phonetic features of another textual term to determine a relationship between the two textual terms. The phonetic features may be used to augment a word-based model to provide a more comprehensive representation for textual terms.

In general, one innovative aspect of the subject matter described in this specification can be embodied in methods that include the actions of obtaining a textual term; determining, by one or more computers, a vector representing a phonetic feature of the textual term; comparing the vector representing the phonetic feature of the textual term with a reference vector representing a phonetic feature of a reference textual term; and classifying the textual term based on the comparing the vector with the reference vector.

These and other implementations can each optionally include one or more of the following features. Obtaining the textual term may include obtaining the textual term from a resource stored at a remote computer. Obtaining the textual term may include obtaining the textual term from a search query. Determining the vector representing the phonetic feature of the textual term may include determining a pronunciation of the textual term.

The textual term may include a plurality of characters. Determining the vector representing the phonetic feature of the textual term may include determining a first vector representing a phonetic feature of a subset of the plurality of characters of the textual term. Classifying the textual term may include determining that the subset of the plurality of characters of the textual term is similar to the reference textual term based on the comparing the vector with the reference vector; and in response to determining that the subset of the plurality of characters of the textual term is similar to the reference textual term, associating a definition of the reference textual term to the subset of the plurality of characters of the textual term.

The actions may include determining a second vector representing a phonetic feature of a second subset of the plurality of characters of the textual term; comparing the second vector with a second reference vector representing a phonetic feature of a second reference textual term; determining that the second subset of the plurality of characters is similar to the second reference textual term based on comparing the second vector with the second reference vector; and in response to determining that the second subset of the plurality of characters is similar to the second reference textual term, associating a definition of the second reference textual term to the second subset of the plurality of characters of the textual term.

Comparing the vector with the reference vector may include determining a cosine distance between the vector and the reference vector.

Classifying the textual term may include determining that the cosine distance is within a specific distance; and in response to determining that the cosine distance is within the specific distance, classifying the textual term as being similar to the reference textual term.

Classifying the textual term may include determining a likelihood that the textual term is similar to the reference textual term; and classifying the textual term based on the likelihood that the textual term is similar to the reference textual term.

Classifying the textual term may include determining that the textual term is similar to the reference textual term; and in response to determining that the textual term is similar to the reference textual term, associating a definition of the reference textual term to the textual term.

Obtaining the textual term may include obtaining one or more textual terms that are surrounding the textual term. Determining the vector representing the phonetic feature of the textual term may include determining the vector using (i) the phonetic feature of the textual term and (ii) the one or more textual terms that are surrounding the textual term.

Advantageous implementations may include one or more of the following features. A system can use phonetic features to classify certain types of textual terms that may not be classified using textual features, such as cognate words, invented words, lemmatization of words, word misspelling, hashtags, or alternative spelling of words. A system can use phonetic features to derive semantic relationships between words that are similar in sounds and concept but not in usage. A system can use phonetic features to augment a word-based model to provide a more comprehensive representation for any objects of interest including textual terms.

The details of one or more implementations are set forth in the accompanying drawings and the description below. Other potential features and advantages will become apparent from the description, the drawings, and the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of an example system.

FIG. 2 is a block diagram of an example system that uses phonetic features of a textual term to augment a word-based model.

FIG. 3 is a block diagram of an example system for training a model.

FIG. 4 is a flow chart illustrating an example process for classifying a textual term using phonetic features.

Like reference numbers and designations in the various drawings indicate like elements.

DETAILED DESCRIPTION

FIG. 1 is a block diagram of an example system 100. The system 100 can be used to classify a textual term using the phonetic features of the textual term. A textual term is a sequence of one or more letters. For example, a textual term may include a word having a dictionary definition, a word without a dictionary definition, a compounded word, a concatenation of textual characters, textual metadata embedded in a resource, or any textual representation of a textual or non-textual resource. The system 100 compares a vector representing the phonetic features (e.g., the pronunciation of the textual term) of the textual term with one or more vectors representing phonetic features of one or more other textual terms to classify the textual term. Examples of classifying a textual term include determining a definition of the textual term, determining other textual terms that are related to the textual term, determining a relationship between the textual term and another textual term, or determining a relationship between the textual term and a non-textual object.

The system 100 includes a resource 104, a computing system 102, and a classified textual term store 116. In general, the resource 104 may be any resource that is associated with a textual term 106. The computing system 102 may be a remote server system that receives the textual term 106 as input, and based on phonetic features of the textual term 106, provides a classification of the textual term 106 as output. The functions performed by the computing system 102 can be performed by individual computer systems or can be distributed across multiple computer systems.

The computing system 102 includes a phonetic engine 108, a trained model 118, and a classifier engine 112. The phonetic engine 108 derives one or more phonetic features 110 of the textual term 106. The trained model 118 generates a representation vector 120 that represents the phonetic features of the textual term 106 in a vector space. The classifier engine 112 classifies the textual term 106 based on the representation vector 120. For example, the classifier engine 112 may compare the representation vector 120 with other representation vectors generated by the trained model 118 in the vector space, and the classifier engine 112 may classify the textual term 106 based on the comparison. The classifier engine 112 outputs a classified textual term 114 that represents a classification of the textual term 106. The classified textual term 114 is stored in the classified textual term store 116.

The resource 104, the computing system 102, and the classified textual term store 116 may be communicatively coupled using a wired network or wireless network or a combination of both. The network can include, for example, a wireless cellular network, a wireless local area network (WLAN) or Wi-Fi network, a Third Generation (3G) or Fourth Generation (4G) mobile telecommunications network, a wired Ethernet network, a private network such as an intranet, a public network such as the Internet, or any appropriate combination of networks.

FIG. 1 also illustrates an example flow of data, shown in stages (A) to (D). Stages (A) to (D) may occur in the illustrated sequence, or they may occur in a suitable sequence that is different than in the illustrated sequence.

During stage (A), the computing system 102 obtains the textual term 106 in the resource 104. In some implementations, the resource 104 may be a textual resource that includes the textual term 106. For example, the resource 104 may be a document in a corpus of documents, where the textual term 106 is included as a word in the document. As another example, the resource 104 may be the source code for a webpage, where the textual term 106 is included in the source code. In some implementations, the resource 104 may be a non-textual resource that is represented by the textual term 106. For example, the resource 104 may be an image, and the textual term 106 may be a caption of the image. As another example, the resource 104 may be a data object, where the textual term 106 is metadata associated with the data object. In some implementations, the resource 104 may be dynamically generated. For example, the resource 104 may be a search query, where the textual term 106 is a search term in the search query. As another example, the resource 104 may be a trending search term, as determined by another computing system.

During stage (B), the phonetic engine 108 determines the phonetic features of the textual term 106, and generates phonetic features 110. In some implementations, the phonetic features 110 include a predicted pronunciation of the textual term. The pronunciation of the textual term can be predicted using a number of suitable techniques. For example, the phonetic engine 108 may use a pronunciation dictionary to predict the pronunciation of the textual term 106. As another example, the phonetic engine 108 may use a set of acoustic rules to predict the pronunciation of the textual term 106. As another example, the phonetic engine 108 may input the textual term 106 to a trained acoustic model to generate a predicted pronunciation of the textual term 106. In some implementations, the phonetic features 110 include an acoustic signal representing the predicted pronunciation of the textual term. For example, the phonetic engine 108 may input the textual term 106 to a trained text-to-speech model to generate the acoustic signal representing the predicted pronunciation of the textual term 106.
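As a minimal sketch of the dictionary-based and rule-based options, the following Python example looks up a term in a toy pronunciation dictionary and falls back to per-letter rules for unknown terms. The function name `predict_pronunciation`, the dictionary entries, the phoneme symbols, and the one-letter-to-one-sound rules are all assumptions made for illustration; they are not taken from this specification.

```python
# Toy pronunciation dictionary. The entries and phoneme symbols are
# illustrative assumptions, not data from this specification.
PRONUNCIATION_DICTIONARY = {
    "vehicle": ["V", "IY", "AH", "K", "AH", "L"],
    "computer": ["K", "AH", "M", "P", "Y", "UW", "T", "ER"],
}

# Hypothetical one-letter-to-one-sound fallback rules (a crude stand-in
# for a real rule set or a trained model).
LETTER_TO_SOUND = {
    "a": "AE", "c": "K", "e": "EH", "h": "HH", "i": "IH", "l": "L",
    "m": "M", "o": "AA", "p": "P", "u": "UW", "v": "V",
}


def predict_pronunciation(term):
    """Return a list of phoneme symbols predicted for `term`."""
    key = term.lower()
    # Prefer the dictionary entry when the term is known.
    if key in PRONUNCIATION_DICTIONARY:
        return list(PRONUNCIATION_DICTIONARY[key])
    # Otherwise apply the per-letter rules; unmapped letters are passed
    # through in upper case so that no information is silently dropped.
    return [LETTER_TO_SOUND.get(ch, ch.upper()) for ch in key if ch.isalpha()]
```

A known term such as "Vehicle" is resolved by the dictionary, while an invented fragment such as "Compu" is handled by the fallback rules.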

In some implementations, the phonetic engine 108 separates the textual term 106 into multiple textual terms, and determines the phonetic features of each respective textual term. For example, for an invented textual term “CompuVehicle,” the phonetic engine 108 may separate the textual term into two textual terms “Compu” and “Vehicle,” and determine the pronunciations of each of “Compu” and “Vehicle.” In some implementations, the phonetic features 110 include the phonetic features of the textual term 106 and the phonetic features of the separated textual terms. For example, the phonetic features 110 of the textual term “CompuVehicle” may include the pronunciations of “CompuVehicle,” “Compu,” “Vehicle,” “CompuVe,” “hicle,” and so on.
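One possible way to enumerate the separated textual terms is sketched below in Python. It assumes that the term is split at a single position and that each part keeps a minimum length; the function name `split_candidates` and the `min_len` parameter are hypothetical and are not specified in this description.

```python
def split_candidates(term, min_len=2):
    """Return the whole term followed by every prefix/suffix pair
    produced by one split, keeping parts of at least `min_len` characters."""
    candidates = [term]
    for i in range(min_len, len(term) - min_len + 1):
        candidates.append(term[:i])  # prefix, e.g. "Compu"
        candidates.append(term[i:])  # suffix, e.g. "Vehicle"
    return candidates
```

For “CompuVehicle” this yields, among others, “Compu,” “Vehicle,” “CompuVe,” and “hicle,” each of which could then be passed to the pronunciation step.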

During stage (C), the trained model 118 determines a representation vector 120 based on the phonetic features 110. The representation vector 120 may be mapped to a vector space that includes reference vectors representing phonetic features of known textual terms. In some implementations, the trained model 118 is trained using these known textual terms, and the reference vectors are generated by the trained model 118.

In some implementations, the trained model 118 includes an embedding function that receives the phonetic features 110 and, in accordance with a set of embedding function parameters, applies a transformation to the phonetic features 110 that maps the phonetic features 110 into a continuous high-dimensional numeric representation. For example, the embedding function can apply a transformation to the phonetic features 110 to map the phonetic features 110 into a floating point representation vector 120. In some implementations, the trained model 118 includes a neural network model that receives the phonetic features 110 and processes the phonetic features 110 through multiple hidden layers to generate the representation vector 120.

In cases where the phonetic features 110 include the phonetic features of the textual term 106 and the phonetic features of the separated textual terms, the trained model 118 may determine a respective representation vector for each of the textual terms. For example, for the textual term “CompuVehicle” described above, the trained model 118 may determine a respective representation vector for each of the textual terms “CompuVehicle,” “Compu,” “Vehicle,” “CompuVe,” “hicle,” and so on.

During stage (D), the classifier engine 112 classifies the textual term 106 using the representation vector 120, and stores the classified textual term 114 in the classified textual term store 116. The classifier engine 112 can be any multiclass or multilabel classifier, e.g., a multiclass logistic regression classifier, a multiclass support vector machine classifier, or a Bayesian classifier.

The classifier engine 112 may compare the representation vector 120 with one or more reference vectors in the vector space. In some implementations, the classifier engine 112 determines a distance, e.g., a cosine distance, between the representation vector 120 and one or more reference vectors in the vector space, and classifies the textual term 106 based on the cosine distance. If the cosine distance between the representation vector 120 and a particular reference vector is within a specific vector distance, the classifier engine 112 may classify that the textual term 106 is related to the textual term associated with the particular reference vector.
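The distance comparison can be illustrated with the following Python sketch. It assumes that cosine distance is defined as one minus cosine similarity and that all vectors are non-zero; the function names, the `max_distance` parameter, and the toy two-dimensional vectors are hypothetical and chosen only for illustration.

```python
import math


def cosine_distance(u, v):
    """Cosine distance, assumed here to be 1 - cosine similarity.
    Both vectors are assumed to be non-zero and of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def related_terms(vector, reference_vectors, max_distance):
    """Return the known terms whose reference vector lies within
    `max_distance` (the specific vector distance) of `vector`."""
    return [
        term
        for term, reference in reference_vectors.items()
        if cosine_distance(vector, reference) <= max_distance
    ]
```

With toy reference vectors `{"Vehicle": [1.0, 0.0], "Computer": [0.0, 1.0]}`, a representation vector `[0.9, 0.1]` and a threshold of 0.1 would be classified as related to “Vehicle” only.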

For example, the classifier engine 112 may determine that the cosine distance between the representation vector of “CompuVehicle” and any reference vector in the vector space is not within the specific vector distance. However, the classifier engine 112 may determine that the cosine distance between the representation vector of the separated textual term “Compu” and the reference vector of a known textual term “Computer” is within the specific vector distance, and the classifier engine 112 may classify that the separated textual term “Compu” is related to the known textual term “Computer.” As another example, the classifier engine 112 may determine that the representation vector of the separated textual term “Vehicle” is numerically equal to the reference vector of a known textual term “Vehicle,” and the classifier engine 112 may classify that the separated textual term “Vehicle” is equivalent to the known textual term “Vehicle.”

As another example, a resource 104 may include a textual term 106 “Vehecle” that is an unintended spelling error of a known textual term “Vehicle.” The classifier engine 112 may determine that the representation vector of the textual term 106 “Vehecle” is within the specific vector distance to the reference vector of the known textual term “Vehicle,” and the classifier engine 112 may classify that the textual term 106 “Vehecle” is related to the known textual term “Vehicle” because their phonetic features (e.g., pronunciations) are similar. Consequently, even textual terms with spelling errors can still be useful for subsequent analysis.

In some implementations, the classified textual term 114 is a data structure indicating the classification of the textual term 106. For example, the classified textual term 114 for the textual term “CompuVehicle” may be an ordered list {“CompuVehicle”, “Computer”, “Vehicle”}, indicating that the classifier engine 112 has determined that the known textual terms “Computer” and “Vehicle” are related to the textual term “CompuVehicle.”

In some implementations, the classified textual term 114 is represented by a word score vector, where each field of the word score vector corresponds to a respective known textual term in a set of known textual terms. For example, each field of the word score vector may include a score indicating a probability that the definition of the textual term 106 corresponds to the definition of the respective known textual term. The classifier engine 112 may determine that the textual term 106 is similar to a particular known textual term if the particular known textual term has the highest score in the word score vector.
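A word score vector of this kind can be sketched in Python as follows. This description does not state how the scores are computed; the example assumes, purely for illustration, that raw similarity values are normalized with a softmax so that the fields behave like probabilities. The function names and the sample similarity values are hypothetical.

```python
import math


def word_score_vector(similarities):
    """Map raw similarity values (one per known term) to scores that
    sum to 1, using a softmax. The softmax is an assumption of this
    sketch, not a technique stated in the description."""
    highest = max(similarities.values())
    # Subtracting the maximum keeps the exponentials numerically stable.
    exps = {term: math.exp(s - highest) for term, s in similarities.items()}
    total = sum(exps.values())
    return {term: e / total for term, e in exps.items()}


def best_match(scores):
    """Return the known term with the highest score."""
    return max(scores, key=scores.get)
```

Given similarities such as `{"Vehicle": 2.0, "Computer": 0.5, "Vessel": 1.0}`, the field for “Vehicle” receives the highest score, so the term would be treated as similar to “Vehicle.”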

The classified textual term 114 stored in the classified textual term store may be used to provide a definition for the textual term 106. The classified textual term 114 stored in the classified textual term store may be used to provide a relationship between the textual term 106 and other known textual terms. The classified textual term 114 stored in the classified textual term store may be used to train an updated model for generating representation vectors. The classified textual term 114 stored in the classified textual term store may be used as input to another computing system.

FIG. 2 is a block diagram of an example system 200 that can use phonetic features of a textual term to augment a word-based model. In general, a word-based model uses textual terms that surround a particular textual term to classify the particular textual term. In some implementations, a word-based model discards textual terms that the computing system cannot classify, and valuable information may be lost as a result. For example, a textual term that has been invented by an author may not be captured by a word-based model because the surrounding textual terms may not provide sufficient information for the classification of the invented textual term. Phonetic features may be used to provide a complementary signal to a word-based model, resulting in a more comprehensive vector space for classifying textual terms. The example system 200 includes a resource 104, a computing system 202, and a classified textual term store 116. The computing system 202 includes a phonetic engine 108, a trained model 218, and a classifier engine 212.

The resource 104 includes the textual term 106 and surrounding textual terms 204. For example, if the textual term 106 is at position t in a sequence of textual terms, the surrounding textual terms 204 are known words at positions t−N, . . . , t−1, t+1, . . . , t+N, respectively, where N is a predetermined integer value.
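The window of surrounding textual terms can be sketched in Python as follows. The function name `surrounding_terms` and the choice to truncate the window at the boundaries of the sequence are assumptions of this example.

```python
def surrounding_terms(terms, t, n):
    """Return the terms at positions t-n .. t-1 and t+1 .. t+n,
    truncated where the window runs past either end of `terms`."""
    before = terms[max(0, t - n):t]
    after = terms[t + 1:t + 1 + n]
    return before + after
```

For the sequence `["the", "new", "CompuVehicle", "drives", "itself"]` with t = 2 and N = 1, the surrounding terms are “new” and “drives.”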

The trained model 218 uses one or both of the phonetic features 110 and the surrounding textual terms 204 to determine a representation vector 220 of the textual term 106. In some implementations, the trained model 218 includes a word-based embedding function that receives a sequence of textual terms and, in accordance with a set of embedding function parameters, applies a transformation to the textual terms that maps the textual terms into a continuous high-dimensional word-based vector space. For example, the word-based embedding function may be a combining embedding function. A combining embedding function maps each textual term in the sequence of textual terms to a respective continuous high-dimensional word-based vector space, for example, to a respective high-dimensional vector of floating point numbers, based on current parameter values of the embedding function, e.g., as stored in a lookup table, and then merges the respective floating point vectors into a single merged vector. The merged floating point representation vector can be used by the classifier engine 212 to classify the textual term 106.
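A combining embedding function can be sketched in Python as a table lookup followed by a merge. The lookup table values, the zero vector used for unknown terms, and the choice of element-wise averaging as the merge operation are all assumptions of this example; the description does not specify how the vectors are merged.

```python
# Hypothetical lookup table standing in for learned embedding parameters.
EMBEDDING_TABLE = {
    "the": [0.1, 0.0, 0.2],
    "car": [0.7, 0.3, 0.1],
    "drives": [0.4, 0.9, 0.0],
}
UNKNOWN_VECTOR = [0.0, 0.0, 0.0]  # assumed fallback for unknown terms


def combine(terms):
    """Look up each term's vector and merge the vectors into a single
    vector by element-wise averaging (one possible merge operation)."""
    if not terms:
        raise ValueError("at least one term is required")
    vectors = [EMBEDDING_TABLE.get(term, UNKNOWN_VECTOR) for term in terms]
    dims = len(UNKNOWN_VECTOR)
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]
```

Other merge operations, such as concatenation or summation, would fit the same structure; averaging is used here only because it keeps the merged vector the same size regardless of the number of surrounding terms.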

In some implementations, the trained model 218 uses the phonetic features 110 to modify the floating point representation vector to determine the representation vector 220. For example, the trained model 218 may add one or more dimensions in the word-based vector space to represent the phonetic features of textual terms. The trained model 218 may map the phonetic features 110 to the additional dimensions, and the values in the additional dimensions may be used by the classifier engine 212 to classify the textual term 106.
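Adding phonetic dimensions to a word-based vector can be sketched in Python as a simple concatenation. The function names and the convention that the phonetic values occupy the trailing positions are assumptions of this example.

```python
def augment(word_vector, phonetic_vector):
    """Append the phonetic dimensions after the word-based dimensions."""
    return list(word_vector) + list(phonetic_vector)


def split_augmented(vector, word_dims):
    """Recover the word-based part and the phonetic part of an augmented
    vector, given the number of word-based dimensions."""
    return vector[:word_dims], vector[word_dims:]
```

Keeping the two parts separable lets a classifier use the word-based values, the phonetic values, or both.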

The classifier engine 212 uses the representation vector 220 to determine a classified textual term 214. The classifier engine 212 receives the representation vector 220 generated by the trained model 218 and classifies the textual term 106 similarly to the classifier engine 112. In some implementations, if the classifier engine 212 determines that the values in the word-based vector space cannot be used to classify the textual term 106, the classifier engine 212 uses the values in the additional dimensions corresponding to the phonetic features 110 to classify the textual term 106. In some implementations, after the classifier engine 212 classifies the textual term 106 using values in the word-based vector space, the classifier engine 212 classifies the textual term 106 using values in the additional dimensions corresponding to the phonetic features 110, and compares the two classifications. The classifier engine 212 may select one or both classifications to be included in the classified textual term 214.

FIG. 3 is a block diagram of an example system 300 that can train a model. In general, a model such as the trained model 118 or the trained model 218 may be trained in any number of ways. The example system 300 illustrates parallel training of a model, where the trained model can be used in systems such as the example systems 100 and 200.

The model receives input and generates an output based on the received input and on values of the parameters of the model. For example, a model may receive data identifying phonetic features of textual terms and, based on the phonetic features of the textual terms and on the parameters of the model, may generate a vector that represents a textual term. The model may be composed of a single level of linear or non-linear operations or may be composed of multiple levels of linear or non-linear operations. An example of a model is a neural network with one or more hidden layers.

The model can be trained using training data, i.e., the training data in training data database 312. The training data in the training data database 312 are inputs for which the desired output, i.e., the output that should be generated by the model, is known. In order to train the model, i.e., find optimal values of the model parameters, an objective function is developed that is a measure of the performance of the model on the set of training data as a function of the model parameters. The optimal values of the parameters of the model can then be found by finding a minimum of the objective function. In particular, multiple iterations of a stochastic gradient descent procedure can then be performed to find the optimal values of the parameters.
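The stochastic gradient descent procedure can be illustrated on a deliberately small model. The Python sketch below fits a one-input linear model by minimizing a squared-error objective; the model form, the learning rate, the number of epochs, and the function name `sgd_fit` are assumptions chosen for illustration and do not describe the model of this specification.

```python
import random


def sgd_fit(examples, learning_rate=0.05, epochs=500, seed=0):
    """Fit y = w * x + b to (x, y) pairs by stochastic gradient descent
    on the objective 0.5 * (prediction - y) ** 2."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(examples)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the training examples in random order
        for x, y in data:
            error = (w * x + b) - y
            # Gradient of the per-example objective with respect to w and b.
            w -= learning_rate * error * x
            b -= learning_rate * error
    return w, b
```

On training pairs generated from y = 2x + 1, the returned parameters approach w = 2 and b = 1, which is the minimum of the objective for that data.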

The example system 300 includes multiple model replicas 302 a-n. Each of the model replicas 302 a-n is an identical instance of a model and can be implemented as one or more computer programs and data deployed to be executed on a computing unit. Advantageously, the computing units are configured so that they can operate independently of each other. In some implementations, only partial independence of operation is achieved, for example, because replica instances share some resources.

A computing unit may be, e.g., a computer, a core within a computer having multiple cores, or other hardware or software within a computer capable of independently performing the computation for the model replica. Each model replica 302 a-n operates independently from each of the other model replicas 302 a-n and is configured to communicate with the training data database 312 and a parameter server 306 through a network, e.g., a local area network (LAN) or wide area network (WAN), e.g., the Internet, in order to compute delta values for the parameters of the model. A delta value for a parameter is a value that the replica has determined is to be applied to the current value of the parameter so that it approaches its optimal value.

The parameter server 306 maintains the current values of the parameters of the model and updates the values as the results of training are uploaded by the replicas. The functionality of the parameter server 306 may be partitioned among multiple parameter server shards 310 a-k. That is, each of the parameter server shards 310 a-k maintains values of a respective subset of the parameters of the model, such that the parameters of the model are partitioned among the parameter server shards 310 a-k. Each parameter server shard 310 a-k is implemented on a respective independent computing unit. Advantageously, the computing units are configured so that they can operate independently of each other. In some implementations, only partial independence of operation is achieved, for example, because replica instances share some resources.

Each of the parameter server shards 310 a-k provides values of parameters to the model replicas 302 a-n, receives delta values of the parameters from the model replicas 302 a-n, and updates stored values of the parameters based on the received delta values independently from each other parameter server shard.
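The division of parameters among shards can be sketched in Python as follows. This single-process example omits networking and concurrency; the class name `ParameterShard`, the round-robin partitioning by sorted parameter name, and the additive application of delta values are assumptions made for illustration.

```python
class ParameterShard:
    """Holds a subset of the model parameters and applies delta values
    to that subset independently of any other shard."""

    def __init__(self, params):
        self.params = dict(params)

    def get(self):
        # Provide a copy of the current values to a model replica.
        return dict(self.params)

    def apply_deltas(self, deltas):
        # Each delta is added to the stored value of its parameter.
        for name, delta in deltas.items():
            self.params[name] += delta


def partition(params, num_shards):
    """Assign parameters to shards round-robin by sorted name."""
    buckets = [{} for _ in range(num_shards)]
    for index, name in enumerate(sorted(params)):
        buckets[index % num_shards][name] = params[name]
    return [ParameterShard(bucket) for bucket in buckets]
```

A replica would fetch values from each shard, compute delta values on its portion of the training data, and send each delta to the shard that owns the corresponding parameter.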

FIG. 4 is a flow diagram that illustrates an example process 400 for classifying a textual term using phonetic features. The process 400 may be performed by data processing apparatus, such as the example system 100 described above, the example system 200 described above, or another data processing apparatus.

The system obtains a textual term (402). For example, the computing system 102 may access a textual term 106 in a resource 104. In some implementations, the system obtains the textual term from a resource stored at a remote computer. In some other implementations, the system obtains the textual term from a search query.

The system determines a vector representing a phonetic feature of the textual term (404). For example, the phonetic engine 108 may determine the phonetic features of the textual term 106, and generate phonetic features 110. The trained model 118 may determine a representation vector 120 based on the phonetic features 110.

In some implementations, the system determines the vector representing the phonetic feature of the textual term by determining a pronunciation of the textual term. For example, the phonetic engine 108 may use a pronunciation dictionary to predict the pronunciation of the textual term 106. As another example, the phonetic engine 108 may use a set of acoustic rules to predict the pronunciation of the textual term 106. As another example, the phonetic engine 108 may input the textual term 106 to a trained acoustic model to generate a predicted pronunciation of the textual term 106.

The system compares the vector representing the phonetic feature of the textual term with a reference vector representing a phonetic feature of a reference textual term (406). For example, the classifier engine 112 may compare the representation vector 120 with one or more reference vectors in the vector space.

The system classifies the textual term based on the comparing the vector with the reference vector (408). In some implementations, the system compares the vector with the reference vector by determining a cosine distance between the vector and the reference vector.

In some implementations, the system classifies the textual term by determining that the cosine distance is within a specific distance. In response to determining that the cosine distance is within the specific distance, the system classifies the textual term as being similar to the reference textual term. For example, a resource 104 may include a textual term 106 “Vehecle” that is an unintended spelling error of a known textual term “Vehicle.” The classifier engine 112 may determine that the representation vector of the textual term 106 “Vehecle” is within the specific vector distance to the reference vector of the known textual term “Vehicle,” and the classifier engine 112 may classify that the textual term 106 “Vehecle” is related to the known textual term “Vehicle” because their phonetic features (e.g., pronunciations) are similar.

In some implementations, the system classifies the textual term by determining a likelihood that the textual term is similar to the reference textual term. The system classifies the textual term based on the likelihood that the textual term is similar to the reference textual term. For example, the classified textual term 114 may be a word score vector, where each field of the word score vector corresponds to a respective known textual term in a set of known textual terms. Each field of the word score vector may include a score indicating a probability that the definition of the textual term 106 corresponds to the definition of the respective known textual term. The classifier engine 112 may determine that the textual term 106 is similar to a particular known textual term if the particular known textual term has the highest score in the word score vector.

In some implementations, the system classifies the textual term by determining that the textual term is similar to the reference textual term. In response to determining that the textual term is similar to the reference textual term, the system associates a definition of the reference textual term to the textual term.

The textual term can include multiple characters. The system determines the vector representing the phonetic feature of the textual term by determining a first vector representing a phonetic feature of a subset of the characters of the textual term. For example, for the textual term “CompuVehicle”, the trained model 118 may determine a respective representation vector for a textual term “Compu.” The system classifies the textual term by determining that the subset of the characters of the textual term is similar to the reference textual term based on the comparing the vector with the reference vector. In response to determining that the subset of the characters of the textual term is similar to the reference textual term, the system associates a definition of the reference textual term to the subset of the characters of the textual term. For example, the classifier engine 112 may determine that the cosine distance between the representation vector of the separated textual term “Compu” and the reference vector of a known textual term “Computer” is within the specific vector distance, and the classifier engine 112 may classify that the separated textual term “Compu” is related to the known textual term “Computer.”

The system determines a second vector representing a phonetic feature of a second subset of the characters of the textual term. For example, for the textual term “CompuVehicle”, the trained model 118 may determine a respective representation vector for a textual term “Vehicle.” The system compares the second vector with a second reference vector representing a phonetic feature of a second reference textual term. The system determines that the second subset of the characters is similar to the second reference textual term based on comparing the second vector with the second reference vector. In response to determining that the second subset of the characters is similar to the second reference textual term, the system associates a definition of the second reference textual term with the second subset of the characters of the textual term. For example, the classifier engine 112 may determine that the representation vector of the separated textual term “Vehicle” is numerically equal to the reference vector of a known textual term “Vehicle,” and the classifier engine 112 may classify the separated textual term “Vehicle” as equivalent to the known textual term “Vehicle.”
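The two-subset flow for “CompuVehicle” might be sketched as follows. The lookup tables, vectors, definitions, threshold, and the nearest-reference selection are assumptions made for illustration; the specification does not prescribe how the character subsets are chosen or how the reference terms are stored.

```python
import math

def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

# Hypothetical reference entries: known term -> (reference vector, definition).
REFERENCES = {
    "computer": ([1.0, 0.2, 0.3], "an electronic device for processing data"),
    "vehicle": ([0.1, 0.8, 0.6], "a machine for transporting people or goods"),
}

# Hypothetical model outputs for the two character subsets of "CompuVehicle".
SUBSET_VECTORS = {
    "Compu": [0.9, 0.1, 0.3],
    "Vehicle": [0.1, 0.8, 0.6],
}

def associate_definitions(subset_vectors, references, max_distance):
    """Map each subset to its nearest reference term and definition, if close enough."""
    associations = {}
    for subset, vector in subset_vectors.items():
        # Find the reference term whose vector is nearest to this subset's vector.
        best_term = min(
            references,
            key=lambda term: cosine_distance(vector, references[term][0]),
        )
        distance = cosine_distance(vector, references[best_term][0])
        if distance <= max_distance:
            associations[subset] = (best_term, references[best_term][1])
    return associations

associations = associate_definitions(SUBSET_VECTORS, REFERENCES, max_distance=0.1)
for subset, (term, definition) in associations.items():
    print(f"{subset} -> {term}: {definition}")
```

Under these assumed values, “Compu” is associated with the definition of “computer” and “Vehicle” with the definition of “vehicle,” which together give a combined reading of the previously unknown compound term.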

In some implementations, the system obtains one or more textual terms that are surrounding the textual term. For example, the trained model 218 obtains surrounding textual terms 204. The surrounding textual terms 204 may be all terms within a specific number of terms adjacent to the textual term 106. Alternatively, the surrounding textual terms 204 may be all terms within a specific number of terms adjacent to the textual term 106, excluding specific terms (e.g., “a”, “an”, “the”, etc.). The system determines the vector representing the phonetic feature of the textual term by determining the vector using (i) the phonetic feature of the textual term and (ii) the one or more textual terms that are surrounding the textual term. For example, the trained model 218 may add one or more dimensions in the word-based vector space to represent the phonetic features of textual terms. The trained model 218 may map the phonetic features 110 to the additional dimensions, and the values in the additional dimensions may be used by the classifier engine 212 to classify the textual term 106.
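A minimal sketch of these two ideas follows: selecting surrounding terms within a window while excluding specific terms, and appending phonetic dimensions to a word-based vector. The window size, the excluded-term list, the example sentence, the vector values, and simple concatenation as the way to "add dimensions" are all assumptions for illustration.

```python
EXCLUDED_TERMS = {"a", "an", "the"}  # example excluded terms from the text above

def surrounding_terms(tokens, index, window):
    """Return terms within `window` positions of tokens[index], minus excluded terms."""
    start = max(0, index - window)
    neighbors = tokens[start:index] + tokens[index + 1:index + 1 + window]
    return [t for t in neighbors if t.lower() not in EXCLUDED_TERMS]

def augment_with_phonetic_dimensions(word_vector, phonetic_features):
    """Append phonetic feature values as extra dimensions of the word-based vector."""
    return list(word_vector) + list(phonetic_features)

# Hypothetical sentence containing the textual term at position 3.
tokens = ["I", "bought", "a", "CompuVehicle", "for", "the", "commute"]
context = surrounding_terms(tokens, index=3, window=2)

# Hypothetical word-based (context-derived) vector and phonetic feature values.
word_vector = [0.2, -0.1]
phonetic_features = [0.7, 0.4, 0.9]
augmented = augment_with_phonetic_dimensions(word_vector, phonetic_features)

print(context)
print(augmented)
```

Concatenation keeps the original word-based dimensions untouched, so a classifier can weigh the contextual and phonetic parts of the vector separately.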

A number of implementations have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the disclosure. For example, various forms of the flows shown above may be used, with steps re-ordered, added, or removed.

Embodiments and all of the functional operations described in this specification may be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Embodiments may be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer-readable medium for execution by, or to control the operation of, data processing apparatus. The computer-readable medium may be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter affecting a machine-readable propagated signal, or a combination of one or more of them. The computer-readable medium may be a non-transitory computer-readable medium. The term “data processing apparatus” encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus may include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal that is generated to encode information for transmission to suitable receiver apparatus.

A computer program (also known as a program, software, software application, script, or code) may be written in any form of programming language, including compiled or interpreted languages, and it may be deployed in any form, including as a standalone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program may be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code). A computer program may be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

The processes and logic flows described in this specification may be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows may also be performed by, and apparatus may also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).

Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read only memory or a random access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic disks, magneto optical disks, or optical disks. However, a computer need not have such devices. Moreover, a computer may be embedded in another device, e.g., a tablet computer, a mobile telephone, a personal digital assistant (PDA), a mobile audio player, or a Global Positioning System (GPS) receiver, to name just a few. Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory may be supplemented by, or incorporated in, special purpose logic circuitry.

To provide for interaction with a user, embodiments may be implemented on a computer having a display device, e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user may provide input to the computer. Other kinds of devices may be used to provide for interaction with a user as well; for example, feedback provided to the user may be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user may be received in any form, including acoustic, speech, or tactile input.

Embodiments may be implemented in a computing system that includes a back end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front end component, e.g., a client computer having a graphical user interface or a Web browser through which a user may interact with an implementation of the techniques disclosed, or any combination of one or more such back end, middleware, or front end components. The components of the system may be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network (“LAN”) and a wide area network (“WAN”), e.g., the Internet.

The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

An “engine” (or “software engine”) refers to a software-implemented input/output system that provides an output that is different from the input. An engine can be an encoded block of functionality, such as a library, a platform, a Software Development Kit (“SDK”), or an object.

While this specification contains many specifics, these should not be construed as limitations, but rather as descriptions of features specific to particular embodiments. Certain features that are described in this specification in the context of separate embodiments may also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment may also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination may in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems may generally be integrated together in a single software product or packaged into multiple software products.

Thus, particular embodiments have been described. Other embodiments are within the scope of the following claims. For example, the actions recited in the claims may be performed in a different order and still achieve desirable results.

What is claimed is:
 1. A computer-implemented method, comprising: obtaining a textual term; determining, by one or more computers, a vector representing a phonetic feature of the textual term; comparing the vector representing the phonetic feature of the textual term with a reference vector representing a phonetic feature of a reference textual term; and classifying the textual term based on the comparing the vector with the reference vector.
 2. The method of claim 1, wherein obtaining the textual term comprises obtaining the textual term from a resource stored at a remote computer.
 3. The method of claim 1, wherein obtaining the textual term comprises obtaining the textual term from a search query.
 4. The method of claim 1, wherein determining the vector representing the phonetic feature of the textual term comprises determining a pronunciation of the textual term.
 5. The method of claim 1, wherein the textual term includes a plurality of characters, wherein determining the vector representing the phonetic feature of the textual term comprises determining a first vector representing a phonetic feature of a subset of the plurality of characters of the textual term, and wherein classifying the textual term comprises: determining that the subset of the plurality of characters of the textual term is similar to the reference textual term based on the comparing the vector with the reference vector; and in response to determining that the subset of the plurality of characters of the textual term is similar to the reference textual term, associating a definition of the reference textual term to the subset of the plurality of characters of the textual term.
 6. The method of claim 5, comprising: determining a second vector representing a phonetic feature of a second subset of the plurality of characters of the textual term; comparing the second vector with a second reference vector representing a phonetic feature of a second reference textual term; determining that the second subset of the plurality of characters is similar to the second reference textual term based on comparing the second vector with the second reference vector; and in response to determining that the second subset of the plurality of characters is similar to the second reference textual term, associating a definition of the second reference textual term to the second subset of the plurality of characters of the textual term.
 7. The method of claim 1, wherein comparing the vector with the reference vector comprises determining a cosine distance between the vector and the reference vector.
 8. The method of claim 7, wherein classifying the textual term comprises: determining that the cosine distance is within a specific distance; and in response to determining that the cosine distance is within the specific distance, classifying the textual term as being similar to the reference textual term.
 9. The method of claim 1, wherein classifying the textual term comprises: determining a likelihood that the textual term is similar to the reference textual term; and classifying the textual term based on the likelihood that the textual term is similar to the reference textual term.
 10. The method of claim 1, wherein classifying the textual term comprises: determining that the textual term is similar to the reference textual term; and in response to determining that the textual term is similar to the reference textual term, associating a definition of the reference textual term to the textual term.
 11. The method of claim 1, wherein obtaining the textual term comprises obtaining one or more textual terms that are surrounding the textual term, and wherein determining the vector representing the phonetic feature of the textual term comprises determining the vector using (i) the phonetic feature of the textual term and (ii) the one or more textual terms that are surrounding the textual term.
 12. A computer-readable medium storing software having stored thereon instructions, which, when executed by one or more computers, cause the one or more computers to perform operations of: obtaining a textual term; determining, by one or more computers, a vector representing a phonetic feature of the textual term; comparing the vector representing the phonetic feature of the textual term with a reference vector representing a phonetic feature of a reference textual term; and classifying the textual term based on the comparing the vector with the reference vector.
 13. The computer-readable medium of claim 12, wherein the textual term includes a plurality of characters, wherein determining the vector representing the phonetic feature of the textual term comprises determining a first vector representing a phonetic feature of a subset of the plurality of characters of the textual term, and wherein classifying the textual term comprises: determining that the subset of the plurality of characters of the textual term is similar to the reference textual term based on the comparing the vector with the reference vector; and in response to determining that the subset of the plurality of characters of the textual term is similar to the reference textual term, associating a definition of the reference textual term to the subset of the plurality of characters of the textual term.
 14. The computer-readable medium of claim 13, wherein the operations comprise: determining a second vector representing a phonetic feature of a second subset of the plurality of characters of the textual term; comparing the second vector with a second reference vector representing a phonetic feature of a second reference textual term; determining that the second subset of the plurality of characters is similar to the second reference textual term based on comparing the second vector with the second reference vector; and in response to determining that the second subset of the plurality of characters is similar to the second reference textual term, associating a definition of the second reference textual term to the second subset of the plurality of characters of the textual term.
 15. The computer-readable medium of claim 12, wherein comparing the vector with the reference vector comprises determining a cosine distance between the vector and the reference vector.
 16. The computer-readable medium of claim 12, wherein obtaining the textual term comprises obtaining one or more textual terms that are surrounding the textual term, and wherein determining the vector representing the phonetic feature of the textual term comprises determining the vector using (i) the phonetic feature of the textual term and (ii) the one or more textual terms that are surrounding the textual term.
 17. A system comprising: one or more processors and one or more computer storage media storing instructions that are operable, when executed by the one or more processors, to cause the one or more processors to perform operations comprising: obtaining a textual term; determining, by one or more computers, a vector representing a phonetic feature of the textual term; comparing the vector representing the phonetic feature of the textual term with a reference vector representing a phonetic feature of a reference textual term; and classifying the textual term based on the comparing the vector with the reference vector.
 18. The system of claim 17, wherein the textual term includes a plurality of characters, wherein determining the vector representing the phonetic feature of the textual term comprises determining a first vector representing a phonetic feature of a subset of the plurality of characters of the textual term, and wherein classifying the textual term comprises: determining that the subset of the plurality of characters of the textual term is similar to the reference textual term based on the comparing the vector with the reference vector; and in response to determining that the subset of the plurality of characters of the textual term is similar to the reference textual term, associating a definition of the reference textual term to the subset of the plurality of characters of the textual term.
 19. The system of claim 18, wherein the operations comprise: determining a second vector representing a phonetic feature of a second subset of the plurality of characters of the textual term; comparing the second vector with a second reference vector representing a phonetic feature of a second reference textual term; determining that the second subset of the plurality of characters is similar to the second reference textual term based on comparing the second vector with the second reference vector; and in response to determining that the second subset of the plurality of characters is similar to the second reference textual term, associating a definition of the second reference textual term to the second subset of the plurality of characters of the textual term.
 20. The system of claim 17, wherein obtaining the textual term comprises obtaining one or more textual terms that are surrounding the textual term, and wherein determining the vector representing the phonetic feature of the textual term comprises determining the vector using (i) the phonetic feature of the textual term and (ii) the one or more textual terms that are surrounding the textual term.